 classifier system


Adapting Rule Representation With Four-Parameter Beta Distribution for Learning Classifier Systems

Shiraishi, Hiroki, Hayamizu, Yohei, Hashiyama, Tomonori, Takadama, Keiki, Ishibuchi, Hisao, Nakata, Masaya

arXiv.org Artificial Intelligence

Rule representations significantly influence the search capabilities and decision boundaries within the search space of Learning Classifier Systems (LCSs), a family of rule-based machine learning systems that evolve interpretable models through evolutionary processes. However, it is very difficult to choose an appropriate rule representation for each problem. Additionally, some problems benefit from using different representations for different subspaces within the input space. Thus, an adaptive mechanism is needed to choose an appropriate rule representation for each rule in LCSs. This article introduces a flexible rule representation using a four-parameter beta distribution and integrates it into a fuzzy-style LCS. The four-parameter beta distribution can form various function shapes, and this flexibility enables our LCS to automatically select appropriate representations for different subspaces. Our rule representation can represent crisp/fuzzy decision boundaries in various boundary shapes, such as rectangles and bells, by controlling four parameters, compared to the standard representations such as trapezoidal ones. Leveraging this flexibility, our LCS is designed to adapt the appropriate rule representation for each subspace. Moreover, our LCS incorporates a generalization bias favoring crisp rules where feasible, enhancing model interpretability without compromising accuracy. Experimental results on real-world classification tasks show that our LCS achieves significantly superior test accuracy and produces more compact rule sets. Our implementation is available at https://github.com/YNU-NakataLab/Beta4-UCS. An extended abstract related to this work is available at https://doi.org/10.36227/techrxiv.174900805.59801248/v1.
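To illustrate how a single four-parameter family can cover both crisp and fuzzy shapes, the following is a minimal sketch and not the authors' implementation. It assumes the membership degree is the four-parameter beta shape (shape parameters `alpha` and `beta`, bounds `lower` and `upper`) rescaled so that its peak equals 1; the normalization used in the paper may differ. The function name and the restriction to `alpha, beta >= 1` are assumptions made for this example.

```python
def beta4_membership(x, alpha, beta, lower, upper):
    """Membership degree from a four-parameter beta shape, peak-scaled to 1.

    Illustrative sketch only: restricted to alpha >= 1 and beta >= 1 so the
    shape has a single peak (or is flat when both equal 1).
    """
    if alpha < 1.0 or beta < 1.0:
        raise ValueError("this sketch only handles alpha >= 1 and beta >= 1")
    if lower >= upper:
        raise ValueError("lower must be strictly less than upper")
    if not lower <= x <= upper:
        return 0.0  # outside the rule's interval
    if alpha == 1.0 and beta == 1.0:
        return 1.0  # uniform shape: a crisp rectangle

    def kernel(t):
        # Unnormalized beta density on the unit interval.
        return t ** (alpha - 1.0) * (1.0 - t) ** (beta - 1.0)

    t = (x - lower) / (upper - lower)          # map x onto [0, 1]
    mode = (alpha - 1.0) / (alpha + beta - 2.0)  # location of the peak
    return kernel(t) / kernel(mode)


if __name__ == "__main__":
    for x in (0.0, 0.25, 0.5, 0.75, 1.0):
        crisp = beta4_membership(x, 1.0, 1.0, 0.0, 1.0)
        bell = beta4_membership(x, 2.0, 2.0, 0.0, 1.0)
        print(f"x={x:.2f} crisp={crisp:.3f} bell={bell:.3f}")
```

With `alpha = beta = 1` the shape is a rectangle over `[lower, upper]`, while larger equal values give a symmetric bell and unequal values skew the peak toward one bound.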


Fuzzy-UCS Revisited: Self-Adaptation of Rule Representations in Michigan-Style Learning Fuzzy-Classifier Systems

Shiraishi, Hiroki, Hayamizu, Yohei, Hashiyama, Tomonori

arXiv.org Artificial Intelligence

This paper focuses on the impact of rule representation in Michigan-style Learning Fuzzy-Classifier Systems (LFCSs) on their classification performance. A good representation of the rules in an LFCS is crucial for improving its performance. However, conventional rule representations frequently struggle with problems whose data characteristics are unknown. To address this issue, this paper proposes a supervised LFCS (i.e., Fuzzy-UCS) with a self-adaptive rule representation mechanism, entitled Adaptive-UCS. Adaptive-UCS incorporates a fuzzy indicator as a new rule parameter that sets the membership function of a rule to either a rectangular (i.e., crisp) or a triangular (i.e., fuzzy) shape. The fuzzy indicator is optimized with evolutionary operators, allowing the system to search for an optimal rule representation. Results from extensive experiments conducted on continuous space problems demonstrate that Adaptive-UCS outperforms other UCSs with conventional crisp-hyperrectangular and fuzzy-hypertrapezoidal rule representations in classification accuracy. Additionally, Adaptive-UCS exhibits robustness in the case of noisy inputs and real-world problems with inherent uncertainty, such as missing values, leading to stable classification performance.
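The role of the fuzzy indicator can be sketched as a per-rule switch between two membership shapes. The example below is an illustration under stated assumptions, not the Adaptive-UCS implementation: the triangle is assumed to peak at the interval midpoint, the overall matching degree is assumed to be the product of per-dimension memberships, and all function names are hypothetical.

```python
def membership(x, lower, upper, is_fuzzy):
    """Membership of x in [lower, upper] for a crisp or triangular shape."""
    if not lower <= x <= upper:
        return 0.0
    if not is_fuzzy:
        return 1.0  # rectangular (crisp): full membership inside the interval
    half_width = (upper - lower) / 2.0
    if half_width == 0.0:
        return 1.0  # degenerate interval: treat the single point as a full match
    center = (lower + upper) / 2.0
    # Triangular (fuzzy): 1 at the center, falling linearly to 0 at the bounds.
    return 1.0 - abs(x - center) / half_width


def match_degree(inputs, intervals, is_fuzzy):
    """Matching degree of one rule: product of per-dimension memberships.

    `intervals` is a list of (lower, upper) pairs, one per input dimension;
    `is_fuzzy` is the rule's single fuzzy indicator.
    """
    degree = 1.0
    for x, (lower, upper) in zip(inputs, intervals):
        degree *= membership(x, lower, upper, is_fuzzy)
    return degree


if __name__ == "__main__":
    intervals = [(0.0, 1.0), (0.0, 1.0)]
    print(match_degree([0.25, 0.5], intervals, is_fuzzy=False))
    print(match_degree([0.25, 0.5], intervals, is_fuzzy=True))
```

Because the indicator is a single flag, an evolutionary operator can flip it by mutation, which is one simple way such a parameter could be searched.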


An Approach to Analyze Niche Evolution in XCS Models

Lanzi, Pier Luca

arXiv.org Artificial Intelligence

We present an approach to identify and track the evolution of niches in XCS that can be applied to any XCS model and any problem. It exploits the underlying principles of the evolutionary component of XCS, and therefore, it is independent of the representation used. It also employs information already available in XCS and thus requires minimal modifications to an existing XCS implementation. We present experiments on binary single-step and multi-step problems involving non-overlapping and highly overlapping solutions. We show that our approach can identify and evaluate the number of niches in the population; we also show that it can be used to identify the composition of active niches so as to track their evolution over time, allowing for a more in-depth analysis of XCS behavior.


Recognizing Hand-Printed Letters and Digits

Neural Information Processing Systems

We are developing a hand-printed character recognition system using a multi-layered neural net trained through backpropagation. We report on results of training nets with samples of hand-printed digits scanned off of bank checks and hand-printed letters interactively entered into a computer through a stylus digitizer. Given a large training set, and a net with sufficient capacity to achieve high performance on the training set, nets typically achieved error rates of 4-5% at a 0% reject rate and 1-2% at a 10% reject rate. The topology and capacity of the system, as measured by the number of connections in the net, have surprisingly little effect on generalization. For those developing practical pattern recognition systems, these results suggest that a large and representative training sample may be the single, most important factor in achieving high recognition accuracy.
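The trade-off between error rate and reject rate can be sketched as follows. This is an illustrative computation, not the procedure from the paper: it assumes each prediction carries a confidence score, that the least confident fraction is rejected, and that the error rate is measured over the accepted samples only. The function name and the sample data are made up for the example.

```python
def error_at_reject_rate(confidences, correct, reject_rate):
    """Error rate on accepted samples after rejecting the least confident ones.

    confidences: one confidence score per prediction (higher = more certain)
    correct:     one boolean per prediction (True if it was right)
    reject_rate: fraction of samples to reject, between 0 and 1
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    n_reject = int(round(reject_rate * n))
    # Indices sorted from least to most confident; drop the first n_reject.
    order = sorted(range(n), key=lambda i: confidences[i])
    accepted = order[n_reject:]
    if not accepted:
        raise ValueError("reject_rate leaves no accepted samples")
    errors = sum(1 for i in accepted if not correct[i])
    return errors / len(accepted)


if __name__ == "__main__":
    # Hypothetical scores and outcomes for ten predictions.
    conf = [0.9, 0.2, 0.8, 0.6, 0.95, 0.3, 0.7, 0.85, 0.99, 0.5]
    ok = [True, False, True, True, True, True, False, True, True, True]
    for rate in (0.0, 0.1, 0.2):
        print(rate, error_at_reject_rate(conf, ok, rate))
```

If low confidence correlates with mistakes, raising the reject rate removes a disproportionate share of errors, which is the pattern the reported figures describe.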


On Effectively Predicting Autism Spectrum Disorder Using an Ensemble of Classifiers

Twala, Bhekisipho, Molloy, Eamon

arXiv.org Artificial Intelligence

An ensemble of classifiers combines several single classifiers to deliver a final prediction or classification decision. An increasingly provocative question is whether such systems can outperform the single best classifier. If so, what form of an ensemble of classifiers (also known as multiple classifier learning systems or multiple classifiers) yields the most significant benefits in the size or diversity of the ensemble itself? Given that the tests used to detect autism traits are time-consuming and costly, developing a system that will provide the best outcome and measurement of autism spectrum disorder (ASD) has never been more critical. In this paper, several single classifiers and, subsequently, multiple classifier learning systems are evaluated in terms of their ability to predict and identify factors that influence or contribute to ASD for early screening purposes. A dataset of behavioural data and robot-enhanced therapy, comprising 3,000 sessions and 300 hours recorded from 61 children, is utilised for this task. Simulation results show the superior predictive performance of multiple classifier learning systems (especially those with three classifiers per ensemble) compared to individual classifiers, with bagging and boosting achieving excellent results. It also appears that social communication gestures remain the critical contributing factor to the ASD problem among children.


Separating Rule Discovery and Global Solution Composition in a Learning Classifier System

Heider, Michael, Stegherr, Helena, Wurth, Jonathan, Sraj, Roman, Hähner, Jörg

arXiv.org Artificial Intelligence

The utilization of digital agents to support crucial decision making is increasing in many industrial scenarios. However, trust in suggestions made by these agents is hard to achieve, though essential for profiting from their application, resulting in a need for explanations for both the decision-making process and the model itself. For many systems, such as common deep learning black-box models, achieving at least some explainability requires complex post-processing, while other systems profit from being, to a reasonable extent, inherently interpretable. In this paper we propose an easily interpretable rule-based learning system specifically designed and thus especially suited for these scenarios and compare it on a set of regression problems against XCSF, a prominent rule-based learning system with a long research history. One key advantage of our system is that the rules' conditions and which rules compose a solution to the problem are evolved separately. We utilise independent rule fitnesses, which allow users to specifically tailor their model structure to fit the given requirements for explainability. We find that the results of SupRB2's evaluation are comparable to XCSF's while allowing easier control of model structure and showing a substantially smaller sensitivity to random seeds and data splits. This increased control aids in subsequently providing explanations for both the training and the final structure of the model.


Classifying the Unstructured IT Service Desk Tickets Using Ensemble of Classifiers

C, Ramya, P, Paramesh S., S, Shreedhara K

arXiv.org Artificial Intelligence

Manual classification of IT service desk tickets may result in routing of the tickets to the wrong resolution group. Incorrect assignment of IT service desk tickets leads to reassignment of tickets, unnecessary resource utilization, and delayed resolution. Traditional machine learning algorithms can be used to automatically classify the IT service desk tickets. Service desk ticket classifier models can be trained by mining the historical unstructured ticket descriptions and the corresponding labels. The model can then be used to classify a new service desk ticket based on the ticket description. The performance of the traditional classifier systems can be further improved by using various ensemble classification techniques. This paper presents the three most popular ensemble methods, i.e., Bagging, Boosting and Voting, for combining the predictions from different models to further improve the accuracy of the ticket classifier system. The performance of the ensemble classifier system is checked against the individual base classifiers using various performance metrics. The ensembles of classifiers performed well in comparison with the corresponding base classifiers. The advantages of building such an automated ticket classifier system are a simplified user interface, faster resolution time, improved productivity, customer satisfaction and growth in business. Real-world service desk ticket data from a large enterprise IT infrastructure is used for our research.
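Of the three ensemble methods named, voting is the simplest to sketch. The example below shows hard (majority) voting over the labels predicted by several base classifiers; it is an illustration, not the paper's pipeline. The ticket categories are invented, and ties are broken in favour of the label that appears first among the votes, which is an arbitrary choice for this sketch.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    if not labels:
        raise ValueError("need at least one vote")
    # Counter.most_common keeps first-seen order among equal counts.
    return Counter(labels).most_common(1)[0][0]


def ensemble_predict(predictions_per_model):
    """Combine per-model prediction lists into one prediction per ticket.

    predictions_per_model: one list of labels per base classifier, all of
    the same length and in the same ticket order.
    """
    return [majority_vote(list(votes)) for votes in zip(*predictions_per_model)]


if __name__ == "__main__":
    # Hypothetical outputs of three base classifiers on four tickets.
    model_a = ["network", "hardware", "access", "network"]
    model_b = ["network", "access", "access", "hardware"]
    model_c = ["hardware", "hardware", "access", "network"]
    print(ensemble_predict([model_a, model_b, model_c]))
```

Bagging and boosting differ in how the base classifiers are trained (resampled data versus reweighted data) but can use a similar combination step at prediction time.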


Deep Learning with a Classifier System: Initial Results

Preen, Richard J., Bull, Larry

arXiv.org Artificial Intelligence

This article presents the first results from using a learning classifier system capable of performing adaptive computation with deep neural networks. Individual classifiers within the population are composed of two neural networks. The first acts as a gating or guarding component, which enables the conditional computation of an associated deep neural network on a per-instance basis. Self-adaptive mutation is applied upon reproduction, and prediction networks are refined with stochastic gradient descent during lifetime learning. The use of fully-connected and convolutional layers is evaluated on handwritten digit recognition tasks, where evolution adapts (i) the gradient descent learning rate applied to each layer; (ii) the number of units within each layer, i.e., the number of fully-connected neurons and the number of convolutional kernel filters; (iii) the connectivity of each layer, i.e., whether each weight is active; and (iv) the weight magnitudes, enabling escape from local optima. The system automatically reduces the number of weights and units while maintaining performance after achieving a maximum prediction error.


Evolving Multi-label Classification Rules by Exploiting High-order Label Correlation

Nazmi, Shabnam, Yan, Xuyang, Homaifar, Abdollah, Doucette, Emily

arXiv.org Machine Learning

In multi-label classification tasks, each problem instance is associated with multiple classes simultaneously. In such settings, the correlation between labels contains valuable information that can be used to obtain more accurate classification models. The correlation between labels can be exploited at different levels such as capturing the pair-wise correlation or exploiting the higher-order correlations. Even though the high-order approach is more capable of modeling the correlation, it is computationally more demanding and has scalability issues. This paper aims at exploiting the high-order label correlation within subsets of labels using a supervised learning classifier system (UCS). For this purpose, the label powerset (LP) strategy is employed and a prediction aggregation within the set of the relevant labels to an unseen instance is utilized to increase the prediction capability of the LP method in the presence of unseen labelsets. Exact match ratio and Hamming loss measures are considered to evaluate the rule performance and the expected fitness value of a classifier is investigated for both metrics. Also, a computational complexity analysis is provided for the proposed algorithm. The experimental results of the proposed method are compared with other well-known LP-based methods on multiple benchmark datasets and confirm the competitive performance of this method.
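The label powerset strategy and the two evaluation measures mentioned above can be sketched in a few lines. This is a generic illustration using the standard definitions, not the proposed algorithm: label powerset maps each distinct label set to one class identifier, exact match ratio is the fraction of instances whose predicted label set equals the true one, and Hamming loss is the fraction of individual label decisions that are wrong. Function names and the toy data are assumptions for this example.

```python
def label_powerset_encode(label_sets):
    """Map each distinct label set to an integer class id (label powerset)."""
    class_of = {}
    encoded = []
    for labels in label_sets:
        key = frozenset(labels)
        if key not in class_of:
            class_of[key] = len(class_of)  # next unused class id
        encoded.append(class_of[key])
    return encoded, class_of


def exact_match_ratio(true_sets, predicted_sets):
    """Fraction of instances whose predicted label set is exactly right."""
    pairs = list(zip(true_sets, predicted_sets))
    return sum(1 for t, p in pairs if set(t) == set(p)) / len(pairs)


def hamming_loss(true_sets, predicted_sets, num_labels):
    """Fraction of (instance, label) decisions that disagree with the truth."""
    pairs = list(zip(true_sets, predicted_sets))
    # The symmetric difference counts labels that are missing or spurious.
    wrong = sum(len(set(t) ^ set(p)) for t, p in pairs)
    return wrong / (len(pairs) * num_labels)


if __name__ == "__main__":
    truth = [{"a", "b"}, {"c"}, {"a"}]
    predicted = [{"a", "b"}, {"b", "c"}, {"b"}]
    print(label_powerset_encode(truth)[0])
    print(exact_match_ratio(truth, predicted))
    print(hamming_loss(truth, predicted, num_labels=3))
```

Exact match ratio gives no credit for partially correct label sets, whereas Hamming loss does, which is why the two measures can rank the same classifier differently.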


SupRB: A Supervised Rule-based Learning System for Continuous Problems

Heider, Michael, Pätzel, David, Hähner, Jörg

arXiv.org Artificial Intelligence

We propose the SupRB learning system, a new Pittsburgh-style learning classifier system (LCS) for supervised learning on multi-dimensional continuous decision problems. SupRB learns an approximation of a quality function from examples (consisting of situations, choices and associated qualities) and is then able to make an optimal choice as well as predict the quality of a choice in a given situation. One area of application for SupRB is parametrization of industrial machinery. In this field, acceptance of the recommendations of machine learning systems is highly reliant on operators' trust. While an essential and much-researched ingredient for that trust is prediction quality, it seems that this alone is not enough. At least as important is a human-understandable explanation of the reasoning behind a recommendation. While many state-of-the-art methods such as artificial neural networks fall short of this, LCSs such as SupRB provide human-readable rules that can be understood very easily. The prevalent LCSs are not directly applicable to this problem as they lack support for continuous choices. This paper lays the foundations for SupRB and shows its general applicability on a simplified model of an additive manufacturing problem.